THE DISTRIBUTION OF THE SPINE OF A FLEMING-VIOT TYPE PROCESS


MARIUSZ BIENIEK AND KRZYSZTOF BURDZY 

Abstract. We show uniqueness of the spine of a Fleming-Viot particle system under 
minimal assumptions on the driving process. If the driving process is a continuous time 
Markov process on a finite space, we show that asymptotically, when the number of 
particles goes to infinity, the branching rate for the spine is twice that of a generic 
particle in the system, and every side branch has the distribution of the unconditioned 
generic branching tree. 


1. Introduction 

It is well known that, under suitable assumptions, a branching process can be decomposed
into a spine and side branches. A detailed review of the relevant literature is
presented in [8, Sect. 2.2]. The “Evans’ immortal particle picture” was introduced in [9].
Another key paper in the area is [11]. Heuristically speaking, the spine has the
distribution of the driving process conditioned on non-extinction, the side branches have the
distributions of the critical branching process, and the branching rate along the spine is
twice the rate along any other trajectory.

We will prove results for the Fleming-Viot branching process introduced in [5] that have
the same intuitive content. Our results have to be formulated in a way different from the
informal description given above for two reasons. The first, rather mundane, reason
is that the Fleming-Viot branching process has a different structure from the processes
considered in [8, Sect. 2.2]. A more substantial difference is that for a Fleming-Viot
process with a fixed (finite) number of particles, the distribution of the spine does not
have an elegant description (as far as we can tell). On top of that, unlike in the
case of superprocesses, the limit of Fleming-Viot processes, when the number of particles
goes to infinity, has not been constructed (and might not exist in any interesting sense).
Hence, our results will be asymptotic in nature. We will show that the limit of the
spine processes, as the number of particles goes to infinity, has the distribution of the
driving process conditioned never to hit the boundary. We will also prove that the rate
of branching along the spine converges to twice the rate of a generic particle and the
distribution of a side branch converges to the distribution of a branching process with the
limiting branching rate.

Our main results on the asymptotic spine distribution are limited to Fleming-Viot
processes driven by continuous time Markov processes on finite spaces. We conjecture 
that analogous results hold for all Fleming-Viot processes (perhaps under mild technical 
assumptions). 

The paper is organized as follows. Section 2 contains basic definitions. It is followed by
Section 3 proving existence of the spine under very weak assumptions, thus significantly
strengthening a similar result from [10]. Section 4 shows that a historical process, in the
spirit of [7], can be represented as a Fleming-Viot process and satisfies an appropriate
limit theorem. Section 5 contains the main theorems on the distribution of the spine, its
branching rate, and its side branches. Section 6 shows by example that the results on the
spine distribution must have asymptotic character because they do not necessarily hold
for a process with a fixed number of particles.

2. Basic definitions 

Our main theorems will be concerned with Fleming-Viot processes driven by Markov 
processes on finite state spaces. Nevertheless we need to consider Fleming-Viot processes 
with an abstract underlying state space because our proofs will be based on “dynamical 
historical processes” which are Fleming-Viot processes driven by Markov processes with 
values in function spaces. 

Let $E$ be a topological space and let $F$ be a Borel proper subset of $E$. We will write
$F^c = E \setminus F$. Let $Y_t$, $t \ge 0$, be a continuous time strong Markov process with state
space $E$, almost all of whose sample paths are right continuous. For $s \ge 0$, let

$$\tau_{F,s} = \inf\{t > s : Y_t \in F^c\},$$

and assume that $\tau_{F,s}$ is a stopping time with respect to the natural filtration of $Y$ for all
$s \ge 0$. We assume that $F^c$ is absorbing, i.e., $Y_t = Y_{\tau_{F,s}}$ for all $t \ge \tau_{F,s}$, a.s.

In most papers on the Fleming-Viot process, either $Y$ is a diffusion in an open subset
$F \subset \mathbb{R}^d$ or $Y$ is a continuous time Markov process and $E$ is a countable set, so
$\tau_{F,s}$ is a stopping time in those cases. We recall here that the hitting time of a Borel
subset of a topological space by a progressively measurable process is a stopping time
(see, e.g., Bass [1]).

We will use $\theta$ to denote the usual shift operator but in this section and Section 3 we do
not assume that the transition probabilities of $Y$ are time homogeneous. We will always
make the following assumptions.

(i) $P(s < \tau_{F,s} < \infty \mid Y_s = x) = 1$ for all $x \in F$ and $s \ge 0$.

(ii) For every $x \in F$ and $s \ge 0$, the conditional distribution of $\tau_{F,s}$ given $\{Y_s = x\}$
has no atoms.

Consider an integer $N \ge 2$ and a family $\{U^i_k,\ 1 \le i \le N,\ k \ge 1\}$ of jointly independent
random variables such that $U^i_k$ has the uniform distribution on the set $\{1,\dots,N\} \setminus \{i\}$.

We will use induction to construct a Fleming-Viot type process $X^N_t = (X^1_t,\dots,X^N_t)$,
$t \ge 0$, with values in $F^N$. Let $\tau_0 = 0$, suppose that $(X^{1,1}_0,\dots,X^{1,N}_0) \in F^N$, and let

$$X^{1,1}_t, \dots, X^{1,N}_t, \qquad t \ge 0, \tag{2.1}$$

be independent and have transition probabilities of the process $Y$. We assume that
processes in (2.1) are independent of the family $\{U^i_k,\ 1 \le i \le N,\ k \ge 1\}$. Let

$$\tau_1 = \inf\{t > 0 : \exists_{1 \le i \le N}\ X^{1,i}_t \in F^c\}.$$

By assumption (ii), no pair of processes can exit $F$ at the same time, so the index $i$ in
the above definition is unique, a.s.

For the induction step, assume that the families

$$X^{j,1}_t, \dots, X^{j,N}_t, \qquad t \ge 0,$$

and the stopping times $\tau_j$ have been defined for $j \le k$. For each $j \le k$, denote by $i_j$ the
unique index such that $X^{j,i_j}_{\tau_j} \in F^c$. Let

$$X^{k+1,m}_{\tau_k} = X^{k,m}_{\tau_k} \quad \text{for } m \ne i_k,$$

and

$$X^{k+1,i_k}_{\tau_k} = X^{k,\,U^{i_k}_k}_{\tau_k}.$$

Let the conditional joint distribution of

$$X^{k+1,1}_t, \dots, X^{k+1,N}_t, \qquad t \ge \tau_k,$$

given $\{X^{k,m}_t,\ 0 \le t \le \tau_k,\ 1 \le m \le N\}$ and $\{U^i_k,\ 1 \le i \le N,\ k \ge 1\}$, be that of $N$
independent processes with transition probabilities of $Y$, starting from $X^{k+1,m}_{\tau_k}$, $1 \le m \le N$.
Let

$$\tau_{k+1} = \inf\{t > \tau_k : \exists_{1 \le i \le N}\ X^{k+1,i}_t \in F^c\}.$$

We define $X^N_t := (X^1_t,\dots,X^N_t)$ by

$$X^m_t = X^{k,m}_t, \quad \text{for } \tau_{k-1} \le t < \tau_k,\ k \ge 1,\ m = 1,2,\dots,N.$$

Note that the process $X^N$ is well defined only up to the time

$$\tau_\infty := \lim_{k\to\infty} \tau_k,$$

which will be called the lifetime of $X^N$. We do not assume that $\tau_\infty = \infty$, a.s.
We will suppress the dependence on N in some of our notation. 
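
The inductive construction above can be sketched in code for the special case treated later in the paper, a driving process that is a continuous-time Markov chain on a finite space. The following is a minimal illustration only, not the authors' construction: the function name, the dictionary representation of the jump rates, and the use of state 0 for $F^c$ are assumptions made for this example, particles are indexed from 0, and every state in $F$ is assumed to have a positive total jump rate.

```python
import random


def simulate_fleming_viot(Q, x0, t_max, rng):
    """Simulate a Fleming-Viot type system driven by a finite-state Markov chain.

    Q     : dict of dicts, Q[x][y] is the jump rate from x to y (x != y);
            state 0 plays the role of F^c (assumption of this sketch).
    x0    : list of initial positions, all in F = {1, ..., n}.
    t_max : time horizon.
    rng   : random.Random instance.
    Returns the positions at time t_max and the list of branching events
    (tau_k, i_k, U_k), with 0-based particle indices.
    """
    n_particles = len(x0)
    pos = list(x0)
    t = 0.0
    events = []
    while True:
        # Total jump rate of each particle at its current position.
        rates = [sum(Q[x].values()) for x in pos]
        t += rng.expovariate(sum(rates))
        if t > t_max:
            return pos, events
        # Choose the particle that moves, then its target state.
        i = rng.choices(range(n_particles), weights=rates)[0]
        targets = list(Q[pos[i]].keys())
        y = rng.choices(targets, weights=[Q[pos[i]][z] for z in targets])[0]
        if y == 0:
            # Particle i exits F: it jumps onto a uniformly chosen other particle.
            u = rng.choice([m for m in range(n_particles) if m != i])
            pos[i] = pos[u]
            events.append((t, i, u))
        else:
            pos[i] = y
```

Because the particles move independently between exit times, drawing the next move from the total rate of all particles reproduces the law of the construction in this finite-state setting; the recorded triples correspond to $(\tau_k, i_k, U^{i_k}_k)$.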
2.1. Dynamical historical processes. The concept of a dynamical historical process
(DHP) was introduced in [10, p. 355] under a different name. We chose the name “dynamical
historical process” because the concept of DHP is based on an intuitive idea similar
to the “historical process” (see [7]). Heuristically speaking, for each $n \in \{1,\dots,N\}$,
$\{H^n_t(s),\ 0 \le s \le t\}$ represents the unique path in the branching structure of the Fleming-Viot
process which goes from $X^n_t$ to one of the points $X^1_0,\dots,X^N_0$ along the trajectories
of $X^1,\dots,X^N$ and does not jump at times $\tau_k$. Note that the process $Y$ may have jumps
so a dynamical historical process $\{H^n_t(s),\ 0 \le s \le t\}$ is not necessarily continuous.

Let $\mathcal{A}$ be the family of all sequences of the form $((a_1,b_1),(a_2,b_2),\dots,(a_k,b_k))$, where
$a_i \in \{1,\dots,N\}$ and $b_i \in \mathbb{N}$ for all $i$. For a sequence $\alpha = ((a_1,b_1),(a_2,b_2),\dots,(a_k,b_k))$
we will write $\alpha + (m,n)$ to denote $((a_1,b_1),(a_2,b_2),\dots,(a_k,b_k),(m,n))$. We will define
a function $\mathcal{L} : \{1,\dots,N\} \times [0,\tau_\infty) \to \mathcal{A}$. We interpret $\mathcal{L}(i,s)$ as a label of $X^i_s$ so, by
abuse of notation, we will write $\mathcal{L}(X^i_s)$ instead of $\mathcal{L}(i,s)$. We let $\mathcal{L}(X^i_s) = ((i,0))$ for all
$0 \le s < \tau_1$ and $1 \le i \le N$. If $\mathcal{L}(X^i_s) = \alpha$ for $\tau_{k-1} \le s < \tau_k$, $i \ne i_k$ and $i \ne U^{i_k}_k$ then we
let $\mathcal{L}(X^i_s) = \alpha$ for $\tau_k \le s < \tau_{k+1}$. Suppose that $i = U^{i_k}_k$ and $\mathcal{L}(X^i_s) = \alpha$ for $\tau_{k-1} \le s < \tau_k$.
Then we let $\mathcal{L}(X^i_s) = \alpha + (i,k)$ and $\mathcal{L}(X^{i_k}_s) = \alpha + (i_k,k)$ for $\tau_k \le s < \tau_{k+1}$.

Later in the paper we will consider a branching process whose individuals are elements
of the set $\mathcal{L}(\{1,\dots,N\} \times [0,\tau_\infty))$. A sequence $\alpha_2$ will be considered an offspring of $\alpha_1$ if
$\alpha_2 = \alpha_1 + (m,n)$ for some $m$ and $n$.

Suppose that $\mathcal{L}(X^n_t) = ((a_1,b_1),(a_2,b_2),\dots,(a_k,b_k))$ for some $k \ge 1$. From the
definition of $\mathcal{L}$ we easily infer that $0 = b_1 < b_2 < \dots < b_k$ and $\tau_{b_k} \le t$, so that
$0 = \tau_{b_1} < \tau_{b_2} < \dots < \tau_{b_k} \le t$. For $\tau_{b_m} \le s < \tau_{b_{m+1}}$ with $1 \le m < k$ we define
$\chi(n,t,s) = a_m$ and $H^n_t(s) = X^{a_m}_s$, and for $\tau_{b_k} \le s \le t$ we define $\chi(n,t,s) = a_k$ and
$H^n_t(s) = X^{a_k}_s$. Note that $H^n_t(s) = X^{\chi(n,t,s)}_s$ and $\chi(n,t,t) = n$ for all $1 \le n \le N$ and
$0 \le s \le t$.

We will call $\{H^n_t(s),\ 0 \le s \le t\}$ a dynamical historical process (DHP) corresponding to
$X^n_t$. Note that $H^n_t$ is defined only for $1 \le n \le N$ and $0 \le t < \tau_\infty$.

We will say that a branching event occurred along $H^k_t$ on the interval $[s_1,s_2]$, where
$0 \le s_1 \le s_2 \le t$, if there exist $s \in [s_1,s_2]$ and $j \ne k$ such that $\chi(j,t,s) = \chi(k,t,s)$ and
$\chi(j,t,s_2) \ne \chi(k,t,s_2)$.
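
The labelling rules of this subsection can be traced on a small example. The sketch below is only an illustration under stated assumptions: the function names `labels_after` and `chi` are hypothetical, the branching events are supplied as a list of pairs $(i_k, U^{i_k}_k)$ with 1-based particle indices, and the times $\tau_k$ are supplied as a list with $\tau_0 = 0$.

```python
def labels_after(events, n_particles):
    """Return the label of every particle after the given branching events.

    events      : list of pairs (i_k, u_k), k = 1, 2, ..., where particle i_k
                  exits F and jumps onto particle u_k (1-based indices).
    n_particles : number of particles N.
    """
    # Before the first branching time, the label of particle i is ((i, 0)).
    label = {i: ((i, 0),) for i in range(1, n_particles + 1)}
    for k, (i_k, u_k) in enumerate(events, start=1):
        alpha = label[u_k]
        # Both the target particle and the jumping particle extend alpha.
        label[u_k] = alpha + ((u_k, k),)
        label[i_k] = alpha + ((i_k, k),)
    return label


def chi(label_n, taus, s):
    """Index of the particle carrying the DHP at time s.

    label_n : label ((a_1, b_1), ..., (a_k, b_k)) of the particle at time t.
    taus    : list with taus[b] equal to the branching time tau_b, taus[0] = 0.
    s       : time in [0, t].
    """
    a = label_n[0][0]
    for a_m, b_m in label_n:
        # The b_m are increasing, so the last entry with tau_{b_m} <= s wins.
        if taus[b_m] <= s:
            a = a_m
    return a
```

For instance, with three particles, if particle 1 jumps onto particle 2 at time $\tau_1$ and particle 3 then jumps onto particle 1 at time $\tau_2$, the DHP of particle 3 follows particle 2 before $\tau_1$, particle 1 on $[\tau_1,\tau_2)$, and particle 3 afterwards.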

3. Existence and uniqueness of the spine 

The spine process will be defined below the statement of Theorem 3.1. Roughly speaking,
the spine is the unique DHP that extends from time 0 to time $\tau_\infty$. The existence
and uniqueness of the spine was proved in [10, Thm. 4] under very restrictive assumptions
on the driving process $Y$ and under the assumption that the lifetime $\tau_\infty$ is infinite.
We will prove that the claim holds under minimal reasonable assumptions, that is, the
strong Markov property of the driving process and non-atomic character of the exit time
distributions.

Theorem 3.1. Fix some $N \ge 2$, suppose that $Y$ satisfies assumptions (i)-(ii) in Section 2
and $X^N_0 \in F^N$, a.s. Then, a.s., there exists a unique infinite sequence $((a_1,b_1),(a_2,b_2),\dots)$
such that its every finite initial subsequence is equal to $\mathcal{L}(X^i_s)$ for some $1 \le i \le N$ and
$s \ge 0$.


In the notation of the theorem, we define the spine of $X^N$ by $J(s) = J^N(s) = X^{a_m}_s$
for $\tau_{b_m} \le s < \tau_{b_{m+1}}$, $m \ge 1$. We also let $\mathcal{L}(J,s) = \mathcal{L}(X^{a_m}_s)$ and $\chi(J,s) = a_m$ for
$\tau_{b_m} \le s < \tau_{b_{m+1}}$, $m \ge 1$.


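Proof of Theorem 3.1. Step 1. We start with a simple estimate. Consider a probability
space $(\Omega, \mathcal{F}, P)$, a $\sigma$-field $\mathcal{G} \subset \mathcal{F}$ and an event $A \in \mathcal{F}$. Suppose that $P(A) \ge p$ and let
$V = P(A \mid \mathcal{G})$. Then

$$p \le E V = E\big(V 1_{\{V < p/2\}}\big) + E\big(V 1_{\{V \ge p/2\}}\big) \le \frac{p}{2}\, P\Big(V < \frac{p}{2}\Big) + 1 \cdot P\Big(V \ge \frac{p}{2}\Big) \le \frac{p}{2} + P\Big(V \ge \frac{p}{2}\Big).$$

This implies that

$$P\big(P(A \mid \mathcal{G}) \ge p/2\big) = P(V \ge p/2) \ge p/2. \tag{3.1}$$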


In the rest of the proof, $P$ will refer to the probability measure on the probability space
where $X^N$ is defined. Let $\tau^i_F = \inf\{t \ge 0 : X^i_t \in F^c\}$ and $m_i = \operatorname{median}(\tau^i_F)$ for $1 \le i \le N$.
Note that since the distribution of each $\tau^i_F$ has no atoms, each $m_i$ is uniquely determined.
The median $m_i$ depends only on $X^i_0$. Let $j$ be the index of the particle with the maximal
median of the exit time from $F$ (we choose the smallest of such indices if there is a tie).
In other words, $j$ is the smallest number satisfying $m_j = \max_{1 \le i \le N} m_i$.

Recall the definition of $\tau_k$ from Section 2. Let $i^*$ be a function of $i$ defined by $\tau^i_F = \tau_{i^*}$
and let

$$A' = \bigcap_{i \ne j} \{\tau^i_F < m_j,\ U^i_{i^*} = j\},$$

$$A'' = \{\tau^j_F > m_j\},$$

$$A = A' \cap A''.$$

The following estimate holds for any $X^N_0 \in F^N$,

$$P(A') = \prod_{i \ne j} \big[ P(\tau^i_F < m_j)\, P(U^i_{i^*} = j) \big] \ge \frac{1}{2^{N-1}} \Big(\frac{1}{N-1}\Big)^{N-1} =: p > 0.$$

The events $A'$ and $A''$ are independent and $P(A'') = 1/2$ so $P(A) \ge p/2 =: p_1$.

Let $\mathcal{F}_t = \sigma\{X^N_s,\ s \le t\}$. Note that all $\tau^i_F$, $1 \le i \le N$, are distinct, a.s., because the
hitting time distributions have no atoms. Let $\hat\tau^1 < \hat\tau^2 < \dots < \hat\tau^{N-1}$ be the ordering of
the set $\{\tau^i_F,\ i \ne j\}$. Let $k(i)$ be defined by $\hat\tau^i = \tau^{k(i)}_F$.
Since $P(A) \ge p_1$, we obtain from (3.1),

$$P\big(P(A \mid \mathcal{F}_{\hat\tau^1}) \ge p_1/2\big) \ge p_1/2.$$

Let $B_1 = \{P(A \mid \mathcal{F}_{\hat\tau^1}) \ge p_1/2\}$ and $C_1 = \{\tau^{k(1)}_F \circ \theta_{\hat\tau^1} > m_j - \hat\tau^1\}$. If $B_1$ holds then

$$1_{\{U^{k(1)}_{k(1)^*} = j\}} = P\big(U^{k(1)}_{k(1)^*} = j \mid \mathcal{F}_{\hat\tau^1}\big) \ge P(A' \mid \mathcal{F}_{\hat\tau^1}) \ge P(A \mid \mathcal{F}_{\hat\tau^1}) \ge p_1/2 > 0.$$

So if $B_1$ holds then $\{U^{k(1)}_{k(1)^*} = j\}$ holds as well. This and the fact that the processes
$\{X^{k(1)}_t,\ t \ge \hat\tau^1\}$ and $\{X^j_t,\ t \ge \hat\tau^1\}$ are conditionally i.i.d. given $\mathcal{F}_{\hat\tau^1}$ imply that on the event
$B_1$,

$$P(C_1 \cap A \mid \mathcal{F}_{\hat\tau^1}) \ge (p_1/2)^2.$$

We have $P(B_1) \ge p_1/2$, so

$$P(C_1 \cap A) \ge (p_1/2)^3. \tag{3.2}$$

Next we will apply induction. Let

$$p_n = (p_{n-1}/2)^3, \qquad n \ge 2,$$

$$C_n = \bigcap_{i=1}^{n} \{\tau^{k(i)}_F \circ \theta_{\hat\tau^i} > m_j - \hat\tau^i\}, \qquad 2 \le n \le N-1,$$

$$B_n = \{P(C_{n-1} \cap A \mid \mathcal{F}_{\hat\tau^n}) \ge p_n/2\}, \qquad 2 \le n \le N-1.$$

Suppose that for some $1 \le n \le N-2$,

$$P(C_n \cap A) \ge p_{n+1}.$$

Note that the above inequality holds for $n = 1$, by (3.2). This induction assumption and
(3.1) imply that

$$P(B_{n+1}) = P\big(P(C_n \cap A \mid \mathcal{F}_{\hat\tau^{n+1}}) \ge p_{n+1}/2\big) \ge p_{n+1}/2.$$

If $B_{n+1}$ holds then

$$1_{\{U^{k(n+1)}_{k(n+1)^*} = j\}} = P\big(U^{k(n+1)}_{k(n+1)^*} = j \mid \mathcal{F}_{\hat\tau^{n+1}}\big) \ge P(A' \mid \mathcal{F}_{\hat\tau^{n+1}}) \ge P(A \mid \mathcal{F}_{\hat\tau^{n+1}}) \ge p_{n+1}/2 > 0.$$

Hence if $B_{n+1}$ holds then $\{U^{k(n+1)}_{k(n+1)^*} = j\}$ holds as well. This and the fact that the processes
$\{X^{k(n+1)}_t,\ t \ge \hat\tau^{n+1}\}$ and $\{X^j_t,\ t \ge \hat\tau^{n+1}\}$ are conditionally i.i.d. given $\mathcal{F}_{\hat\tau^{n+1}}$ imply that on
the event $B_{n+1}$,

$$P(C_{n+1} \cap A \mid \mathcal{F}_{\hat\tau^{n+1}}) \ge (p_{n+1}/2)^2.$$

We have $P(B_{n+1}) \ge p_{n+1}/2$, so

$$P(C_{n+1} \cap A) \ge (p_{n+1}/2)^3.$$

This finishes the induction step. We conclude that

$$P(C_{N-1} \cap A) \ge (p_{N-1}/2)^3 =: q, \tag{3.3}$$

where $q > 0$ depends only on $N$.

If $C_{N-1} \cap A$ holds then all DHP paths $H^n_{m_j}$, $1 \le n \le N$, must pass through $X^j_{\hat\tau^1}$, so they
all agree with $H^j_{m_j}$ on the time interval $[0,\hat\tau^1]$ in the sense that $\chi(n,m_j,s) = \chi(j,m_j,s)$
for all $s \in [0,\hat\tau^1]$ and $1 \le n \le N$.


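Step 2. Let $\sigma_0 = 0$, $j_1 = j$, $m_{j_1,1} = m_j$ and $\sigma_1 = m_j \wedge \tau^j_F$.

Suppose that $j_n$ and $\sigma_n$ have been defined for $n = 1,\dots,k$. We define $\tau^{i,k+1}_F =
\inf\{t > \sigma_k : X^i_t \in F^c\}$ and $m_{i,k+1}$ to be the median of the conditional distribution of
$\tau^{i,k+1}_F - \sigma_k$ given $\mathcal{F}_{\sigma_k}$ for $1 \le i \le N$. Let $j_{k+1}$ be the smallest number satisfying
$m_{j_{k+1},k+1} = \max_{1 \le i \le N} m_{i,k+1}$. Let

$$\sigma_{k+1} = (m_{j_{k+1},k+1} + \sigma_k) \wedge \tau^{j_{k+1},k+1}_F.$$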

It is easy to see that a k < a.s., for all k > 1. Let C^r_ x and A k be defined in the way 

analogous to C/v_i and A but relative to and Let D k be dehned by the condition 

1 D k = 1 ° 0**-! for k > 1. 

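By (3.3) and the strong Markov property of $X^N$ applied at time $\sigma_{k-1}$, for each $k \ge 2$,

$$P(D_k^c \mid D_1^c, \dots, D_{k-1}^c) \le 1 - q.$$

It follows that for every $k \ge 2$,

$$P\Big(\bigcup_{i=1}^{k} D_i\Big) = 1 - P(D_1^c)\, P(D_2^c \mid D_1^c) \cdots P(D_k^c \mid D_1^c, \dots, D_{k-1}^c) \ge 1 - (1-q)^k.$$

So if $D_* = \bigcup_{k=1}^{\infty} D_k$ then $P(D_*) = 1$, i.e., almost surely at least one of the events $D_k$
occurs. For any $m \ge 1$, the same claim applies to the process $X^N$ after time $\tau_m$, by the
strong Markov property, so if $G_m = \{1_{D_*} \circ \theta_{\tau_m} = 1\}$, then $P(G_m) = 1$ for all $m \ge 1$, and,
therefore,

$$P\Big(\bigcap_{m \ge 1} G_m\Big) = 1. \tag{3.4}$$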


Fix any $m \ge 1$. By (3.4), with probability 1, there exists $k$ such that $1_{D_k} \circ \theta_{\tau_m} = 1$
and we let $k$ denote the smallest integer with this property. Let $\eta^m_k = \sigma_k \circ \theta_{\tau_m} + \tau_m$,
$k \ge 1$. The last remark in Step 1 implies that all DHP paths $H^n_{\eta^m_k}$, $1 \le n \le N$, must
agree on the time interval $[0,\tau_m]$, a.s., that is, $\chi(j,\eta^m_k,s) = \chi(1,\eta^m_k,s)$ for all $s \in [0,\tau_m]$
and $1 \le j \le N$. For any $t < \tau_\infty$ we find a random $m$ such that $t < \tau_m$ and $k$ such
that $1_{D_k} \circ \theta_{\tau_m} = 1$. Suppose that $\mathcal{L}(X^1_{\eta^m_k}) = ((c_1,d_1),\dots,(c_n,d_n))$. It is easy to check
that if we let $a_m = c_m$ and $b_m = d_m$ for $1 \le m \le n$ such that $\tau_{d_{m+1}} \le t$ then this
definition is consistent when we vary $t$ over $[0,\tau_\infty)$. We have defined a unique sequence
$((a_1,b_1),(a_2,b_2),\dots)$ satisfying the theorem. □
4. Dynamical historical process as a Fleming-Viot process


We will write $\{Y^t_s,\ 0 \le s \le t\}$ to denote the process $Y$ conditioned by $\tau_{F,0} > t$. In
this section, we prove that, as the number of particles $N$ goes to infinity, the empirical
distribution of DHPs at time $t$ converges to the distribution of the trajectory of the process
$Y$ conditioned by $\tau_{F,0} > t$. For technical reasons we impose two extra assumptions on $X$;
they will stay in force for the rest of the paper. We assume that $\tau_\infty = \infty$, a.s., for all $N$,
and that the process $Y$ is time homogeneous. If the driving process is Brownian motion in
a Lipschitz domain with the Lipschitz constant less than 1 [2, 10] or Brownian motion
in a polytope with $N = 2$ [3], then $\tau_\infty = \infty$, a.s. However, it was proved in [3] that
$\tau_\infty < \infty$, a.s., for every $N$, for some Fleming-Viot processes driven by one-dimensional
diffusions. Crucially for the rest of our paper, it is easy to see that if the driving process
is a continuous time Markov process on a finite space then $\tau_\infty = \infty$, a.s.

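Let $D([0,t],E)$ denote the usual Skorokhod space of càdlàg functions with values in $E$.
Let $H^N_t = (H^1_t,\dots,H^N_t)$, where $H^k_t$ is the DHP of $X^k_t$. Let $\mathcal{X}^N_t$ and $\mathcal{H}^N_t$ denote the empirical
distributions of $X^N_t$ and $H^N_t$, resp., i.e.,

$$\mathcal{X}^N_t(A) = \frac{1}{N} \sum_{k=1}^{N} \delta_{X^k_t}(A), \qquad \text{for } A \subset E,$$

$$\mathcal{H}^N_t(A) = \frac{1}{N} \sum_{k=1}^{N} \delta_{H^k_t}(A), \qquad \text{for } A \subset D([0,t],E).$$

Let $P^\lambda$ and $E^\lambda$ denote the probability distribution and the corresponding expectation for
the process $X^N$, assuming that the empirical distribution of $X^N_0$ is $\lambda$.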

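Theorem 4.1. Assume that $\mathcal{X}^N_0 \Rightarrow \lambda$ as $N \to \infty$ for some probability measure $\lambda$ on $F$
and $\tau_\infty = \infty$, a.s., for all $N$. Then for every fixed $t \ge 0$ and continuous bounded function
$f$ on $D([0,t],E)$, when $N \to \infty$, in probability,

$$\mathcal{H}^N_t(f) \to E^\lambda\big(f(\{Y^t_s,\ 0 \le s \le t\})\big).$$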


Proof. The theorem follows rather easily from a result of Villemonais [12] but we have to
reformulate the problem to be able to apply that theorem in our setting. Specifically, we
have to represent DHP as a time-homogeneous Markov process.

For $y \in D([0,t],E)$ and $t \ge 0$ let

$$y^{(t)}(s) = \begin{cases} y(s), & \text{for } s \le t, \\ y(t), & \text{for } s > t, \end{cases}$$

and note that $y^{(t)} \in D([0,\infty),E)$. Let $Z_t = (t, Y^{(t)})$. Note that the process $Z$ is a
time-homogeneous Markov process with trajectories in the space $D^* := D([0,\infty), \mathbb{R}_+ \times
D([0,\infty),E))$. Let

$$F^Z = \{(t,y) \in \mathbb{R}_+ \times D([0,\infty),E) : y(t) \in F^c\}.$$

The set $F^Z$ is absorbing for $Z$ in the sense that if $Z_s \in F^Z$ and $s < t$ then $Z_t \in F^Z$ but
it is not true that $Z_t = Z_s$.

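Let $\widetilde H^N_t = (\widetilde H^1_t,\dots,\widetilde H^N_t)$, where $\widetilde H^k_t(s) = (t, H^k_t(s))$ for $s \le t$ and $\widetilde H^k_t(s) = (t, H^k_t(t))$ for
$s > t$. Then $\widetilde H^N$ is a Fleming-Viot process in $\mathbb{R}_+ \times D([0,\infty),E)$ based on $Z$. A “particle”
in this process jumps to the location of another particle when it hits $F^Z$. For $t \ge 0$ we
define the empirical distribution $\widetilde{\mathcal{H}}^N_t$ of $\widetilde H^N_t$ as

$$\widetilde{\mathcal{H}}^N_t(A) = \frac{1}{N} \sum_{k=1}^{N} \delta_{\widetilde H^k_t}(A), \qquad A \subset \mathbb{R}_+ \times D([0,\infty),E).$$

Note that for $A \subset D([0,\infty),E)$ we have $\widetilde{\mathcal{H}}^N_t(\mathbb{R}_+ \times A) = \mathcal{H}^N_t(A)$.

By [12, Thm. 1], for every fixed $t \ge 0$ and continuous bounded function $g$ on $D^*$, when
$N \to \infty$, in probability,

$$\widetilde{\mathcal{H}}^N_t(g) \to E\big(g(Z_t) \mid Z_s \notin F^Z,\ 0 \le s \le t\big).$$

This is essentially the assertion of the theorem, cloaked in a different formal statement. □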

Remark 4.2. For later reference we state [12, Thm. 1]. This claim may be also considered
a corollary of Theorem 4.1. Assume that $\mathcal{X}^N_0 \Rightarrow \lambda$ as $N \to \infty$ for some probability
measure $\lambda$ on $F$. Then for every fixed $t \ge 0$ and continuous bounded function $f$ on $E$,
when $N \to \infty$, in probability,

$$\mathcal{X}^N_t(f) \to E^\lambda\big(f(Y^t_t)\big).$$

5. The asymptotic distribution of the spine 

For the remaining part of the paper we assume that $Y$ is a time-homogeneous continuous-time
Markov chain with finite state space $E = \{0,1,\dots,n\}$. We choose $\{1,\dots,n\}$ to play
the role of $F$. We assume that $F$ is a communicating class in the sense that for all
$x,y \in F$, there is a positive probability that $Y$ will visit $y$ before hitting 0 if it starts from
$x$. Recall that $J^N$ denotes the spine process defined after the statement of Theorem 3.1.
Recall that $\{Y^t(s),\ 0 \le s \le t\}$ is the process $Y$ conditioned by $\tau_{F,0} > t$. Let $Y^\infty$ denote
the process $Y$ conditioned never to leave $F$. The process $Y^\infty$ can be described as the
spatial component of the space-time Doob’s $h$-process obtained from $\{(t,Y_t),\ t \ge 0\}$ by
conditioning by the parabolic function $h$ which is 0 on $F^c$ and grows to infinity on $F$.
Alternatively, we may define the distribution of $Y^\infty$ as the limit, as $t \to \infty$, of distributions
of $Y^t$. We will not provide a more formal construction of $Y^\infty$ because it does not pose
any technical challenges in our context.
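
For a finite chain, the $h$-process description can be made concrete with a standard fact that is not stated in the paper and is used here as an assumption: if $Q_F$ is the transition rate matrix restricted to $F$, $-\lambda_0$ is its eigenvalue with the largest real part and $h > 0$ is a corresponding right eigenvector, then the conditioned chain has rates $q_{xy}\, h(y)/h(x)$ for $x \ne y$. A minimal numerical sketch, with a hypothetical function name and with state 0 stored as the first row and column of the rate matrix:

```python
import numpy as np


def conditioned_generator(Q):
    """Rate matrix of a finite chain conditioned never to leave F (a sketch).

    Q : (n+1) x (n+1) transition rate matrix, state 0 absorbing, F = {1, ..., n}
        assumed to be a communicating class.
    Returns (lam0, h, G): the decay rate, a positive right eigenvector of the
    restricted matrix, and the rate matrix G of the conditioned chain on F.
    """
    QF = Q[1:, 1:]
    eigvals, eigvecs = np.linalg.eig(QF)
    # The eigenvalue with the largest real part is real (Perron-Frobenius).
    idx = int(np.argmax(eigvals.real))
    lam0 = -eigvals[idx].real
    # The corresponding eigenvector has entries of one sign; make it positive.
    h = np.abs(eigvecs[:, idx].real)
    # Off-diagonal rates q_xy * h(y) / h(x); diagonal shifted by lam0 so rows sum to 0.
    G = QF * h[None, :] / h[:, None] + lam0 * np.eye(len(h))
    return lam0, h, G
```

In this sketch the space-time function $e^{\lambda_0 t} h(x)$ plays the role of the parabolic function mentioned above: it vanishes on $F^c$ if $h$ is extended by 0, and it grows to infinity in $t$ on $F$.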

Theorem 5.1. Consider a probability measure $\lambda$ on $F$ and suppose that $\mathcal{X}^N_0 \Rightarrow \lambda$ as
$N \to \infty$. The distribution of $J^N$ converges to the distribution of $Y^\infty$ with the initial
distribution $\lambda$ when $N \to \infty$.

Proof. Consider any $t > 0$. By Theorem 4.1 the empirical distribution of the dynamical
historical paths of $X^N$ at time $t$ converges to the distribution of $\{Y^t(s),\ 0 \le s \le t\}$, when
$N \to \infty$. Since the set $F$ is finite, this implies that for every fixed $x \in F$, the empirical
distribution of DHPs which end at $x$ at time $t$ converges, as $N \to \infty$, to the distribution
of $\{Y^t(s),\ 0 \le s \le t\}$ conditioned by $\{Y^t(t) = x\}$.

Fix any $u > 0$ and a sequence $s_k \to \infty$. The results of [6] (see especially (1.1) and
Section 4) can be used to show that for any $x_k \in F$, the distributions of $\{Y^{s_k}(s),\ 0 \le
s \le u\}$ conditioned by $\{Y^{s_k}(s_k) = x_k\}$ converge, as $k \to \infty$, to the distribution of
$\{Y^\infty(s),\ 0 \le s \le u\}$.

Since the state space $F$ is finite, we can use the diagonal method to show that for any
sequence $N_m$ going to infinity we can find a subsequence $N'_m$ of $N_m$ such that for some
$p_{x,k} \ge 0$, we have for all $x$ and $k$,

$$\lim_{m \to \infty} P\big(J^{N'_m}_{s_k} = x\big) = p_{x,k}. \tag{5.1}$$

Given $\mathcal{X}^{N'_m}_{s_k}$, every DHP which ends at $x$ at time $s_k$ has the same probability of being the
initial part of the spine. This observation, (5.1) and the earlier remarks on the convergence
of DHPs and convergence of $\{Y^{s_k}(s),\ 0 \le s \le u\}$ conditioned by $\{Y^{s_k}(s_k) = x_k\}$ imply
that the distribution of $\{J^{N'_m}_s,\ 0 \le s \le u\}$ converges, as $m \to \infty$, to the distribution of
$\{Y^\infty(s),\ 0 \le s \le u\}$. Since $u$ is arbitrary and $N'_m$ is a subsequence of any sequence $N_m$,
the theorem follows. □

Let $q_{xy}$ denote elements of the transition rate matrix $Q$ for the process $Y$ and let

$$\lambda_t = \sum_{y \in F} P^\lambda(Y^t_t = y)\, q_{y0}. \tag{5.2}$$

Let $M^m_t$ be the number of times that the process $X^m$ branched before time $t$. More
formally, in the notation of Section 2, $M^m_t$ is the number of $k$ such that $\tau_k \le t$ and
$U^{i_k}_k = m$. Let $M^J_t$ be the number of times that the process $J$ branched before time $t$.
More precisely, let $M^J_t$ be the number of $k$ such that $\tau_k \le t$ and either $i_k = \chi(J,\tau_k)$ or
$U^{i_k}_k = \chi(J,\tau_k)$.
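
The intensity in (5.2) can be evaluated numerically for a finite chain. The sketch below rests on the assumption that $P^\lambda(Y^t_t = y)$ is the law of $Y_t$ started from $\lambda$ and conditioned on $\tau_{F,0} > t$, that is, the normalized row vector $\lambda e^{tQ_F}$, where $Q_F$ is the rate matrix restricted to $F$. The function names are hypothetical, state 0 is stored as the first row and column, and the matrix exponential is computed by a simple scaled Taylor series to keep the example self-contained.

```python
import numpy as np


def expm(A, terms=20):
    """Matrix exponential by scaling and squaring with a Taylor series."""
    norm = np.linalg.norm(A, ord=np.inf)
    # Scale A so that the Taylor series converges quickly.
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / (2 ** s)
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for j in range(1, terms + 1):
        term = term @ B / j
        E = E + term
    # Undo the scaling by repeated squaring.
    for _ in range(s):
        E = E @ E
    return E


def killing_intensity(Q, init, t):
    """Evaluate lambda_t = sum_y P(Y_t = y | no exit before t) * q_{y0}.

    Q    : (n+1) x (n+1) rate matrix with state 0 absorbing.
    init : initial distribution on F = {1, ..., n} (length n).
    t    : time at which the intensity is evaluated.
    """
    QF = Q[1:, 1:]
    q0 = Q[1:, 0]
    v = init @ expm(t * QF)  # unnormalized law of Y_t on {no exit before t}
    return float(v @ q0 / v.sum())
```

Under these assumptions $\lambda_t$ is the hazard rate of the exit time $\tau_{F,0}$, so for a chain with a single state in $F$ it is constant and equal to the exit rate.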

Proposition 5.2. Assume that $\mathcal{X}^N_0 \Rightarrow \lambda$ as $N \to \infty$ for some probability measure $\lambda$ on
$F$. For every fixed $m$, the distribution of $M^m$ converges to the distribution of the Poisson
process with variable intensity $\lambda_t$ as $N \to \infty$.

Proof. Every process $M^m$ is a Poisson process with variable random intensity equal to
$\sum_{y \in F} \mathcal{X}^N_t(y)\, q_{y0}$ at time $t$. Fix any $t > 0$. Definition (5.2) together with finiteness of $F$
imply that it will suffice to prove that, for all $\varepsilon_1, p_1 > 0$,

$$\limsup_{N \to \infty} P\Big( \sup_{s \in [0,t]} \sum_{y \in F} \big| \mathcal{X}^N_s(y) - P^\lambda(Y^s_s = y) \big| > \varepsilon_1 \Big) < p_1. \tag{5.3}$$

Since $F$ is finite, it will be enough to prove that, for all $\varepsilon_1, p_1 > 0$ and $y \in F$,

$$\limsup_{N \to \infty} P\Big( \sup_{s \in [0,t]} \big| \mathcal{X}^N_s(y) - P^\lambda(Y^s_s = y) \big| > \varepsilon_1 \Big) < p_1.$$

Suppose to the contrary that there exist $p_1, \varepsilon_1 > 0$ and $y \in F$ such that

$$\limsup_{N \to \infty} P\Big( \sup_{s \in [0,t]} \big| \mathcal{X}^N_s(y) - P^\lambda(Y^s_s = y) \big| > \varepsilon_1 \Big) \ge p_1.$$

The set $F$ has cardinality $n$ and $\sum_{y \in F} \mathcal{X}^N_s(y) = \sum_{y \in F} P^\lambda(Y^s_s = y) = 1$ so the above
assumption implies that there exist $p_1, \varepsilon_1 > 0$ and $y^* \in F$ such that

$$\limsup_{N \to \infty} P\Big( \sup_{s \in [0,t]} \big( \mathcal{X}^N_s(y^*) - P^\lambda(Y^s_s = y^*) \big) > \varepsilon_1/n \Big) \ge p_1. \tag{5.4}$$

It is easy to see that for any $\varepsilon_1, p_1 > 0$ one can find $\delta \in (0,t)$ so small that for every
$y \in F$ and $s \ge 0$ the following holds.

(1) If the number of $k$ such that $X^k_s = y$ is greater than or equal to $j$ then with
probability greater than $1 - p_1$, the number of $k$ such that $X^k_u = y$ for all $u \in
[s, s+\delta]$ is greater than $j(1 - \varepsilon_1/(3n))$.

(2) For all $u \in [s, s+\delta]$,

$$\big| P^\lambda(Y^u_u = y) - P^\lambda(Y^s_s = y) \big| < \varepsilon_1/(3n).$$

Let

$$T = \inf\{s \ge 0 : \mathcal{X}^N_s(y^*) - P^\lambda(Y^s_s = y^*) \ge \varepsilon_1/n\},$$

$$k_1 = \inf\{k \ge 0 : k\delta \ge T\}.$$

By (5.4) and the strong Markov property applied at $T$,

$$\limsup_{N \to \infty} P\Big( \mathcal{X}^N_{k_1\delta}(y^*) - P^\lambda\big(Y^{k_1\delta}_{k_1\delta} = y^*\big) \ge 2\varepsilon_1/(3n) \Big) \ge p_1(1 - p_1).$$

Note that $k_1\delta \le 2t$ and let $m_1 = \lceil 2t/\delta \rceil + 1$. It follows that for some non-random
$0 \le k \le m_1$,

$$\limsup_{N \to \infty} P\Big( \mathcal{X}^N_{k\delta}(y^*) - P^\lambda\big(Y^{k\delta}_{k\delta} = y^*\big) \ge 2\varepsilon_1/(3n) \Big) \ge \frac{p_1(1 - p_1)}{m_1 + 1}.$$

This contradicts Remark 4.2 applied at the time $k\delta$. The contradiction completes the
proof. □

Recall the Prokhorov distance between probability measures on the Skorokhod space
(see [4, p. 238]). Convergence in the Prokhorov distance is equivalent to the weak
convergence of measures.

Corollary 5.3. Assume that $\mathcal{X}^N_0 \Rightarrow \lambda$ as $N \to \infty$ for some probability measure $\lambda$ on $F$.
Fix any $t_1, t_2 > 0$. Consider a Fleming-Viot process $\{X_t,\ t \ge s\}$ with $N$ particles, where
$s \in [0,t_1]$. For any $\varepsilon > 0$ there exist $\delta > 0$ and $N_1$ such that for all $N \ge N_1$, every
$1 \le m \le N$, $s \in [0,t_1]$ and all initial distributions of $X_s$ satisfying
$|\mathcal{X}_s(y) - P^\lambda(Y^s_s = y)| \le \delta$ for all $y \in F$, the Prokhorov distance between the distribution
of $\{M^m_t,\ t \in [s, s+t_2]\}$ and the distribution of the Poisson process with intensity $\lambda_t$,
given by (5.2), on the interval $[s, s+t_2]$ is less than $\varepsilon$.


Proof. Suppose that the corollary is false. Then there exist $\varepsilon > 0$ and sequences
$(\mathcal{X}^k)_{k \ge 1}$, $(s_k)_{k \ge 1}$, $(\delta_k)_{k \ge 1}$ and $(N_k)_{k \ge 1}$, such that we have $s_k \in [0,t_1]$ for all $k$,
$N_k \to \infty$, $\delta_k \to 0$, $|\mathcal{X}^k_{s_k}(y) - P^\lambda(Y^{s_k}_{s_k} = y)| \le \delta_k$ and the Prokhorov distance between
the distribution of $\{M^{N_k,m}_t,\ t \in [s_k, s_k+t_2]\}$ and the distribution of the Poisson process with
variable intensity $\lambda_t$ on the interval $[s_k, s_k+t_2]$ is greater than $\varepsilon$. By compactness, we can find
a convergent subsequence of $(s_k)_{k \ge 1}$. By abuse of notation, we will assume that $s_k \to
s_\infty \in [0,t_1]$. This and the continuity of the transition probabilities of $Y$ imply that
$\mathcal{X}^k_{s_k}(y) - P^\lambda(Y^{s_\infty}_{s_\infty} = y) \to 0$ for all $y \in F$. Let $\widetilde X^{N_k} := X^{N_k} \circ \theta_{s_k}$. An application of
Proposition 5.2 to processes $\widetilde X^{N_k}$ shows that the distribution of $\{M^{N_k,m}_t,\ t \in [s_k, s_k+t_2]\}$
converges to the distribution of the Poisson process with variable intensity $\lambda_{t+s_\infty}$ on the
interval $[0,t_2]$. This contradicts the assumption made at the beginning of the proof. □


Theorem 5.4. Assume that $\lambda$ is a probability measure on $F$ with $\lambda(x) > 0$ for all $x \in F$.
Suppose that $\mathcal{X}^N_0 \Rightarrow \lambda$ as $N \to \infty$. The distribution of $M^J$ converges to the distribution
of the Poisson process with intensity $2\lambda_t$ when $N \to \infty$, where $\lambda_t$ is given by (5.2).


We have assumed that $\lambda(x) > 0$ for all $x \in F$ for technical reasons. We expect the
theorem to hold without this assumption.


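Proof of Theorem 5.4. Fix any $t_1 > 0$ and let $m_1 = \lceil t_1/\delta \rceil + 1$, where $\delta$ will be specified
later. It would suffice to prove the following assertions.

(1) For every $\varepsilon > 0$ there exists $\delta_1 > 0$ such that for every $\delta \in (0,\delta_1)$ there exists $N_1$
such that for $N \ge N_1$ and $m = 1,2,\dots,m_1$,

$$P\big( M^J_{m\delta} - M^J_{(m-1)\delta} = 1 \mid \mathcal{F}_{(m-1)\delta} \big) \in \big[ (1-\varepsilon)\, 2\delta\lambda_{(m-1)\delta},\ (1+\varepsilon)\, 2\delta\lambda_{(m-1)\delta} \big]. \tag{5.5}$$

(2) There exist $c$ and $\delta_1 > 0$ such that for all $\delta \in (0,\delta_1)$ there exists $N_1$ such that for
$N \ge N_1$ and $m = 1,2,\dots,m_1$,

$$P\big( M^J_{m\delta} - M^J_{(m-1)\delta} > 1 \mid \mathcal{F}_{(m-1)\delta} \big) \le c\delta^2. \tag{5.6}$$

Our strategy will be to prove estimates of the type (5.5)-(5.6) but our argument will
be a little bit more complicated.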

Since $F$ is finite, we have

$$c_1 := n \sup_{x \in F,\, y \in E} q_{xy} < \infty. \tag{5.7}$$

It is easy to see that for every probability measure $\lambda$ on $F$ with $\lambda(x) > 0$ for all $x \in F$
and $\varepsilon > 0$ there exist $c_2, \delta_1 > 0$ and $c_3 < \infty$ such that for all $\delta \in (0,\delta_1]$, $t \in [0, 2t_1]$, and
$x \in F$, if $Y_0$ has the distribution $\lambda$ then,

$$P(Y^t_t = x) > c_2, \tag{5.8}$$

$$\sup_{s \in [t,t+\delta]} P(Y^s_s = x) \le (1+\varepsilon) \inf_{s \in [t,t+\delta]} P(Y^s_s = x), \tag{5.9}$$

$$\sup_{s \in [t,t+\delta]} \lambda_s \le (1+\varepsilon) \inf_{s \in [t,t+\delta]} \lambda_s, \tag{5.10}$$

$$c_2 < \lambda_t < c_3. \tag{5.11}$$

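Let $M_t = \sum_{m=1}^{N} M^m_t$. By (5.3) and (5.10), for any $\varepsilon > 0$, some $\delta_1 > 0$, for every
$\delta \in (0,\delta_1)$ there exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$,

$$P\big( (1-\varepsilon) N\delta\lambda_{(m-1)\delta} \le M_{m\delta} - M_{(m-1)\delta} \le (1+\varepsilon) N\delta\lambda_{(m-1)\delta} \big) \ge 1 - \delta^4. \tag{5.12}$$

By (5.3), for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0,\delta_1)$ there exists $N_1$ so large that
for $N \ge N_1$ and $1 \le m \le m_1$,

$$P\bigg( \sup_{x \in F}\ \sup_{s \in [(m-1)\delta,\, m\delta]} \frac{\big| \mathcal{X}^N_s(x) - P(Y^s_s = x) \big|}{P(Y^s_s = x)} \le \varepsilon \bigg) \ge 1 - \delta^4, \tag{5.13}$$

and, trivially following from the last formula,

$$P\bigg( \sup_{x \in F} \frac{\big| \mathcal{X}^N_{(m-1)\delta}(x) - P\big(Y^{(m-1)\delta}_{(m-1)\delta} = x\big) \big|}{P\big(Y^{(m-1)\delta}_{(m-1)\delta} = x\big)} \le \varepsilon \bigg) \ge 1 - \delta^4. \tag{5.14}$$

Let $A_1$ be the event in the last formula.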

Let $R'_1$ be the number of $k$ such that there was exactly one branching event and at least
one jump along $H^k_{m\delta}$ on the interval $[(m-1)\delta, m\delta]$, and $\chi(k,m\delta,m\delta) = \chi(k,m\delta,(m-1)\delta)$.
Recall the definition of $c_1$ from (5.7). The intensity of jumps of any process $X^j$ at any
position is bounded by $c_1 < \infty$. It follows that for any $\varepsilon > 0$, the probability that a
process $X^j$ will jump at least once on the interval $[(m-1)\delta, m\delta]$ and some other process
will jump onto $X^j$ on the same interval (i.e., $U^{i_k}_k = j$ for some $\tau_k \in [(m-1)\delta, m\delta]$) is
bounded by $2c_3\delta \cdot 2c_1\delta$, for small $\delta$. We can assume that $\varepsilon < 1$ in (5.12) so we see that
there exists $c_4$ such that for some $\delta_1 > 0$ and all $\delta \in (0,\delta_1)$ there exists $N_1$ so large that
for $N \ge N_1$ and $1 \le m \le m_1$,

$$E R'_1 \le c_4 N\delta^2, \qquad E (R'_1)^2 \le c_4 N\delta^2.$$

This and (5.11) imply that for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0,\delta_1)$, there exists
$N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$,

$$P(R'_1 \ge N\varepsilon\delta) = P\big( (R'_1)^2 \ge (N\varepsilon\delta)^2 \big) \le \frac{c_4 N\delta^2}{(N\varepsilon\delta)^2} \le \delta^4. \tag{5.15}$$

Let $R''_1$ be the number of $k$ such that there was exactly one branching event and at least
one jump along $H^k_{m\delta}$ on the interval $[(m-1)\delta, m\delta]$, and at least one jump occurred later
than the branching event. An argument very similar to that leading to (5.15) shows that
for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0,\delta_1)$ there exists $N_1$ so large that for $N \ge N_1$
and $1 \le m \le m_1$,

$$P(R''_1 \ge N\varepsilon\delta) \le \delta^4. \tag{5.16}$$

Let $R_1$ be the number of $k$ such that there was exactly one branching event and at least
one jump along $H^k_{m\delta}$ on the interval $[(m-1)\delta, m\delta]$. Then $R_1 \le 2R'_1 + R''_1$. We combine
(5.15) and (5.16) to conclude that for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0,\delta_1)$ there
exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$,

$$P(R_1 \ge 3N\varepsilon\delta) \le 2\delta^4. \tag{5.17}$$

Since $\varepsilon > 0$ is arbitrarily small in the last formula, (5.8)-(5.11), (5.14) and (5.17) imply
that for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0,\delta_1)$ there exists $N_1$ so large that for
$N \ge N_1$ and $1 \le m \le m_1$,

$$P\Big( R_1 \ge N\varepsilon\delta\lambda_{(m-1)\delta} \inf_{x \in F} \mathcal{X}^N_{(m-1)\delta}(x) \Big) \le 3\delta^4. \tag{5.18}$$

Recall the set $\mathcal{A}$, function $\mathcal{L}$, and the notions of branching and offspring for elements of
$\mathcal{A}$ introduced in Section 2.1. We will consider a branching process $\mathcal{B}$ whose first generation
of individuals consists of $\alpha \in \mathcal{A}$ such that $\alpha = \mathcal{L}(X^i_{(m-1)\delta})$ for some $i$. The branching
process includes all descendants $\beta \in \mathcal{A}$ of the first generation individuals provided $\beta =
\mathcal{L}(X^j_s)$ for some $j$ and $(m-1)\delta \le s \le m\delta$. An individual has a pair of offspring on
$[(m-1)\delta, m\delta]$ with probability not greater than $2c_1\delta$, for small $\delta$. Otherwise it has no
offspring. It follows that the expected number of individuals in the $j$-th generation is
bounded by $N(2c_1\delta)^{j-1}$.

Let $R_2$ be the number of $k$ such that there were exactly two branching events along $H^k_{m\delta}$
on the interval $[(m-1)\delta, m\delta]$. Let $R_3$ be the number of $k$ such that there were exactly
three branching events along $H^k_{m\delta}$ on the interval $[(m-1)\delta, m\delta]$. Let $R_4$ be the number
of individuals in the fifth and higher generations of $\mathcal{B}$. Unlike in the case of $R_1$ we do not
impose any conditions on the number of jumps.

Note that the inequality $\delta \le \varepsilon/(8c_1^2)$ is equivalent to $N\varepsilon\delta \ge N(2c_1\delta)^2 + N\varepsilon\delta/2$ so either
one implies that $P(R_2 \ge N\varepsilon\delta) \le P(R_2 \ge N(2c_1\delta)^2 + N\varepsilon\delta/2)$.

Standard formulas imply that the variance of the number of individuals in the third
generation of $\mathcal{B}$ is bounded by $4N(2c_1\delta)^2$ for small $\delta$. This implies that for any $\varepsilon > 0$,
some $\delta_1 \in (0, \varepsilon/(8c_1^2))$, for every $\delta \in (0,\delta_1)$, there exists $N_1$ so large that for $N \ge N_1$ and
$1 \le m \le m_1$,

$$P(R_2 \ge N\varepsilon\delta) \le P\big( R_2 \ge N(2c_1\delta)^2 + N\varepsilon\delta/2 \big) \le \frac{4N(2c_1\delta)^2}{(N\varepsilon\delta/2)^2} \le \delta^4. \tag{5.19}$$

The inequality 8 < ^/e/{^c^) is equivalent to Ne8 > N(2ci8) 3 + Ne8/2 so either one 
implies that P (R 3 > Ne8 ) < P(i ?3 > N(2ci8 ) 3 + Ne8/ 2). 
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The variance of the number of individuals in the fourth generation of B is bounded by 
4iV(2ci<5) 3 for small h, so for any e > 0, some hi G (0, yfe /(4c^ 2 )), for every h G (0, <5i), 
there exists N\ so large that for N > N\ and 1 < m < mi, 

P(R 3 > NeS) < P (R, > N(2 Cl S) 3 + NeS/2) < ^ S A 20 ) 

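For some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there exists $N_1$ so large that for $N \ge N_1$ and 
$1 \le m \le m_1$, 

$$E(R_4) \le \sum_{j \ge 4} N(2c_1\delta)^j \le 2N(2c_1\delta)^4.$$

This and (5.11) imply that for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there exists 
$N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$, 

$$P(R_4 > N\varepsilon\delta) \le \frac{2N(2c_1\delta)^4}{N\varepsilon\delta} = 32c_1^4\delta^3/\varepsilon \le \delta^{5/2}. \tag{5.21}$$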
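The threshold arithmetic behind (5.19)-(5.21) can be checked numerically. The sketch below is only an illustration: the sample values of `eps`, `c1` and `N` are hypothetical (the paper fixes $c_1$ in (5.7) and lets $\varepsilon$ be arbitrary), and the function name is not from the paper.

```python
import math

def generation_bounds(eps, c1, delta, N):
    """Quantities compared in the estimates leading to (5.19)-(5.21)."""
    target = N * eps * delta
    return {
        # inequality used for R_2 (third generation)
        "second": target >= N * (2 * c1 * delta) ** 2 + target / 2,
        # inequality used for R_3 (fourth generation)
        "third": target >= N * (2 * c1 * delta) ** 3 + target / 2,
        # Markov bound E(R_4) / (N eps delta) with E(R_4) <= 2N(2 c1 delta)^4
        "markov": 2 * N * (2 * c1 * delta) ** 4 / target,
    }

# Hypothetical sample values, chosen only for illustration.
eps, c1, N = 0.1, 2.0, 10**6
d2 = eps / (8 * c1 ** 2)               # threshold eps / (8 c1^2)
d3 = math.sqrt(eps) / (4 * c1 ** 1.5)  # threshold sqrt(eps) / (4 c1^{3/2})

for delta in (0.5 * d2, 2.0 * d2, 0.5 * d3, 2.0 * d3):
    b = generation_bounds(eps, c1, delta, N)
    print(delta, b["second"], b["third"], b["markov"], 32 * c1 ** 4 * delta ** 3 / eps)
```

Below each threshold the corresponding inequality holds and above it fails, and the Markov bound agrees with $32c_1^4\delta^3/\varepsilon$ up to rounding.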

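The same justification which enabled us to conclude (5.18) from (5.17) also gives the 
following estimates, based on (5.19), (5.20) and (5.21). For any $\varepsilon > 0$ there exists $\delta_1 > 0$ 
such that for all $\delta \in (0, \delta_1)$ there exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$, 

$$P\Big(R_2 > N\varepsilon\delta\lambda_{(m-1)\delta} \inf_{x\in F} \mathcal{X}^N_{(m-1)\delta}(x)\Big) \le \delta^4, \tag{5.22}$$

$$P\Big(R_3 > N\varepsilon\delta\lambda_{(m-1)\delta} \inf_{x\in F} \mathcal{X}^N_{(m-1)\delta}(x)\Big) \le \delta^4, \tag{5.23}$$

$$P\Big(R_4 > N\varepsilon\delta\lambda_{(m-1)\delta} \inf_{x\in F} \mathcal{X}^N_{(m-1)\delta}(x)\Big) \le \delta^{5/2}. \tag{5.24}$$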


Let $R_{5,x}$ be the number of particles that jumped to $x$ on the interval $[(m-1)\delta, m\delta]$. 
Since the particles which exit from $F$ jump to the position of a uniformly chosen particle 
in $F$, (5.8), (5.9), (5.12) and (5.13) imply that for any $\varepsilon > 0$, some $\delta_1 > 0$, for every 
$\delta \in (0, \delta_1)$ there exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$, 

$$P\left(\sup_{x\in F} \frac{\big|R_{5,x} - \delta\lambda_{(m-1)\delta} N\mathcal{X}^N_{(m-1)\delta}(x)\big|}{\delta\lambda_{(m-1)\delta} N\mathcal{X}^N_{(m-1)\delta}(x)} \le \varepsilon\right) \ge 1 - \delta^4. \tag{5.25}$$

Let $A_2$ be the event in the last formula. 

Let $R_{1,x}$ be the number of $k$ such that there was exactly one branching event along $H^k_{m\delta}$ 
on the interval $[(m-1)\delta, m\delta]$, and $H^k_{m\delta}(m\delta) = x$. Note that 

$$2R_{5,x} - R_1 - R_2 - R_3 - R_4 \le R_{1,x} \le 2R_{5,x} + R_1.$$

This, (5.18), (5.22), (5.23), (5.24) and (5.25) imply that for any $\varepsilon > 0$, some $\delta_1 > 0$, for 
every $\delta \in (0, \delta_1)$ there exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$, 

$$P\left(\frac{\big|R_{1,x} - 2\delta\lambda_{(m-1)\delta} N\mathcal{X}^N_{(m-1)\delta}(x)\big|}{\delta\lambda_{(m-1)\delta} N\mathcal{X}^N_{(m-1)\delta}(x)} \le 5\varepsilon\right) \ge 1 - 2\delta^{5/2}. \tag{5.26}$$

Let $A_3$ be the event in the last formula. 

Let $R_{6,x}$ be the number of particles that were located at $x$ at time $(m-1)\delta$ and jumped 
from $x$ to some other state on the interval $[(m-1)\delta, m\delta]$. Recall the definition (5.7) of $c_1$ 
and that of the event $A_1$ in (5.14). We use (5.8) and (5.14) to see that for any $\varepsilon > 0$, some 
$\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$, 


Rt 


6,x 


p 1 S 2c 1 5NX i l_ 1 )s (x) 


R 


6,x 




> 1 


A\ (5.27) 


< 5 4 + n max E 

xG F 


( c.SNX^x) 


A, < _V> 


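Let $A_4$ be the event on the left hand side in the last formula. 
Suppose that $A_2 \cap A_4$ holds. Then for every $x \in F$, 

$$N\mathcal{X}^N_{m\delta}(x) \le N\mathcal{X}^N_{(m-1)\delta}(x) + R_{5,x} \le N\mathcal{X}^N_{(m-1)\delta}(x) + (1+\varepsilon)\delta\lambda_{(m-1)\delta} N\mathcal{X}^N_{(m-1)\delta}(x)$$

and 

$$N\mathcal{X}^N_{m\delta}(x) \ge N\mathcal{X}^N_{(m-1)\delta}(x) - R_{6,x} \ge N\mathcal{X}^N_{(m-1)\delta}(x) - 2c_1\delta N\mathcal{X}^N_{(m-1)\delta}(x).$$

Hence, 

$$1 - 2c_1\delta \le \frac{\mathcal{X}^N_{m\delta}(x)}{\mathcal{X}^N_{(m-1)\delta}(x)} \le 1 + (1+\varepsilon)\delta\lambda_{(m-1)\delta}.$$

If in addition $A_3$ holds then 

$$\frac{(2-5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 + (1+\varepsilon)\delta\lambda_{(m-1)\delta}} \le \frac{R_{1,x}}{N\mathcal{X}^N_{m\delta}(x)} \le \frac{(2+5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 - 2c_1\delta}. \tag{5.28}$$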

Let $A_5 = A_2 \cap A_3 \cap A_4$. By (5.25), (5.26) and (5.27), we have $P(A_5^c) \le 3\delta^{5/2}$ for small 
$\delta$, so for every $m$ there exists an event $B_m \in \mathcal{F}_{(m-1)\delta}$ such that 

$$P(B_m) \ge 1 - 2\delta^{5/4} \tag{5.29}$$

and on the event $B_m$, 

$$E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big) \le 2\delta^{5/4}. \tag{5.30}$$

Let $C^x_k$ be the intersection of $\{H^k_{m\delta}(m\delta) = x\}$ and the event that there was exactly 
one branching event along $H^k_{m\delta}$ on the interval $[(m-1)\delta, m\delta]$. Let $C_J$ be the event that 
there was exactly one branching event along the spine on the interval $[(m-1)\delta, m\delta]$. Let 
$D^x_k = \{\mathcal{L}(J^N, m\delta) = \mathcal{L}(X^k_{m\delta}),\ J^N_{m\delta} = x\}$. In the following calculation we use the Markov 
property applied at time $m\delta$ and (5.28). 

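$$\begin{aligned}
P(C_J \mid \mathcal{F}_{(m-1)\delta})
&= \sum_{x\in F}\sum_{k=1}^N P\big(D^x_k \cap C^x_k \mid \mathcal{F}_{(m-1)\delta}\big) \\
&= E\Big(\sum_{x\in F}\sum_{k=1}^N P\big(D^x_k \cap C^x_k \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&= E\Big(\sum_{x\in F}\sum_{k=1}^N \mathbf{1}_{C^x_k} P\big(D^x_k \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&= E\Big(\sum_{x\in F}\sum_{k=1}^N \mathbf{1}_{C^x_k \cap A_5} P\big(D^x_k \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big)
 + E\Big(\sum_{x\in F}\sum_{k=1}^N \mathbf{1}_{C^x_k \cap A_5^c} P\big(D^x_k \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&\le E\Big(\sum_{x\in F}\sum_{k=1}^N \mathbf{1}_{C^x_k \cap A_5} P\big(D^x_k \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big)
 + E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big) \\
&= E\Big(\sum_{x\in F}\sum_{k=1}^N \mathbf{1}_{C^x_k \cap A_5} \frac{P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big)}{N\mathcal{X}^N_{m\delta}(x)} \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big)
 + E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big) \\
&= E\Big(\sum_{x\in F} \frac{P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big)}{N\mathcal{X}^N_{m\delta}(x)} \sum_{k=1}^N \mathbf{1}_{C^x_k \cap A_5} \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big)
 + E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big) \\
&= E\Big(\sum_{x\in F} \frac{P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big)}{N\mathcal{X}^N_{m\delta}(x)} R_{1,x}\mathbf{1}_{A_5} \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big)
 + E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big) \\
&\le E\Big(\sum_{x\in F} P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big) \frac{(2+5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 - 2c_1\delta} \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big)
 + E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big) \\
&= \frac{(2+5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 - 2c_1\delta} E\Big(\sum_{x\in F} P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big)
 + E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big) \\
&= \frac{(2+5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 - 2c_1\delta} + E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big).
\end{aligned}$$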

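This, (5.11) and (5.30) imply that for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there 
exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$, on the event $B_m$, 

$$\begin{aligned}
P(C_J \mid \mathcal{F}_{(m-1)\delta}) &\le \frac{(2+5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 - 2c_1\delta} + E\big(\mathbf{1}_{A_5^c} \mid \mathcal{F}_{(m-1)\delta}\big) \le \frac{(2+5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 - 2c_1\delta} + 2\delta^{5/4} \\
&\le (2+6\varepsilon)\delta\lambda_{(m-1)\delta}.
\end{aligned} \tag{5.31}$$

Let $C'_J$ be the event that there were at least two branching events along the spine on 
the interval $[(m-1)\delta, m\delta]$. A calculation similar to that for $C_J$ but using (5.22)-(5.24) 
instead of (5.26) shows that there exist events $B'_m \in \mathcal{F}_{(m-1)\delta}$ such that $P(B'_m) \ge 1 - 2\delta^{5/4}$ 
and for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there exists $N_1$ so large that for 
$N \ge N_1$ and $1 \le m \le m_1$, on the event $B'_m$, 

$$P(C'_J \mid \mathcal{F}_{(m-1)\delta}) \le 3\varepsilon\delta\lambda_{(m-1)\delta}.$$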

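We now derive the lower estimate for $P(C_J \mid \mathcal{F}_{(m-1)\delta})$, 

$$\begin{aligned}
P(C_J \mid \mathcal{F}_{(m-1)\delta})
&= \sum_{x\in F}\sum_{k=1}^N P\big(D^x_k \cap C^x_k \mid \mathcal{F}_{(m-1)\delta}\big) \\
&= E\Big(\sum_{x\in F}\sum_{k=1}^N P\big(D^x_k \cap C^x_k \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&= E\Big(\sum_{x\in F}\sum_{k=1}^N \mathbf{1}_{C^x_k} P\big(D^x_k \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&\ge E\Big(\sum_{x\in F}\sum_{k=1}^N \mathbf{1}_{C^x_k \cap A_5} P\big(D^x_k \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&= E\Big(\sum_{x\in F}\sum_{k=1}^N \mathbf{1}_{C^x_k \cap A_5} \frac{P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big)}{N\mathcal{X}^N_{m\delta}(x)} \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&= E\Big(\sum_{x\in F} \frac{P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big)}{N\mathcal{X}^N_{m\delta}(x)} \sum_{k=1}^N \mathbf{1}_{C^x_k \cap A_5} \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&= E\Big(\sum_{x\in F} \frac{P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big)}{N\mathcal{X}^N_{m\delta}(x)} R_{1,x}\mathbf{1}_{A_5} \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&\ge E\Big(\sum_{x\in F} P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big) \frac{(2-5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 + (1+\varepsilon)\delta\lambda_{(m-1)\delta}} \mathbf{1}_{A_5} \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&= \frac{(2-5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 + (1+\varepsilon)\delta\lambda_{(m-1)\delta}} E\Big(\mathbf{1}_{A_5}\sum_{x\in F} P\big(J^N_{m\delta} = x \mid \mathcal{F}_{m\delta}\big) \,\Big|\, \mathcal{F}_{(m-1)\delta}\Big) \\
&= \frac{(2-5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 + (1+\varepsilon)\delta\lambda_{(m-1)\delta}} E\big(\mathbf{1}_{A_5} \mid \mathcal{F}_{(m-1)\delta}\big).
\end{aligned} \tag{5.32}$$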

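This, (5.11) and (5.30) imply that for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there 
exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$, on the event $B_m$, 

$$\begin{aligned}
P(C_J \mid \mathcal{F}_{(m-1)\delta}) &\ge \frac{(2-5\varepsilon)\delta\lambda_{(m-1)\delta}}{1 + (1+\varepsilon)\delta\lambda_{(m-1)\delta}} E\big(\mathbf{1}_{A_5} \mid \mathcal{F}_{(m-1)\delta}\big) \\
&\ge \frac{(2-5\varepsilon)\delta\lambda_{(m-1)\delta}\,(1 - 2\delta^{5/4})}{1 + (1+\varepsilon)\delta\lambda_{(m-1)\delta}} \ge (2-6\varepsilon)\delta\lambda_{(m-1)\delta}.
\end{aligned} \tag{5.33}$$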

Let $B_* = \bigcap_{1 \le m \le m_1} (B_m \cap B'_m)$. We have 

$$P(B_*^c) \le m_1 4\delta^{5/4} = ([t_1/\delta] + 1)4\delta^{5/4} \le c_5\delta^{1/4}.$$

Let $M$ be a Poisson process with intensity $2\lambda_t$, independent of $M^J$. We define $\widetilde{M}^J$ by 
setting $\widetilde{M}^J_t - \widetilde{M}^J_{(m-1)\delta} = M^J_t - M^J_{(m-1)\delta}$ for $t \in [(m-1)\delta, m\delta]$ on the event $B'_m \cap B_m$ for 
$m \ge 1$. We let $\widetilde{M}^J_t - \widetilde{M}^J_{(m-1)\delta} = M_t - M_{(m-1)\delta}$ for $t \in [(m-1)\delta, m\delta]$ on $(B'_m \cap B_m)^c$ for 
$m \ge 1$. It follows from (5.31), (5.32) and (5.33) that $\widetilde{M}^J$ satisfies (5.5)-(5.6). Hence, the 
processes $\widetilde{M}^J$ converge to the Poisson process with intensity $2\lambda_t$ as $N \to \infty$. Since $P(B_*^c)$ 
can be made arbitrarily small by choosing small $\delta$, we conclude that the processes $M^J$ 
converge to the Poisson process with intensity $2\lambda_t$ as $N \to \infty$. □ 


The next theorem is concerned with the distribution of a side branch of the spine 
of a Fleming-Viot process. To this aim we consider two branching processes. The first 
process, $V$, will be the branching version of $Y$ with the deterministic branching rate $\lambda_t$ 
defined in (5.2). Then we will define a branching process $Z$ representing descendants 
(along historical paths) of one of the components of $X$. The constructions are routine but 
tedious so we will only sketch them. 

Fix any probability measure X on $F$. Given $x_1 \in F$ and $t_1 \ge 0$, let $\{Y_t, t \ge t_1\}$ have 
the distribution of the process $Y$ started from $x_1$ at time $t_1$. Let $\tau_F$ be the exit time 
of $Y$ from $F$. Let $U$ be an independent random variable with the distribution given by 
$P(U > u) = \exp\left(-\int_{t_1}^u \lambda_t\,dt\right)$ for $u \ge t_1$. We let $\zeta = \tau_F \wedge U$. If $\tau_F < U$ then we let $\alpha = 0$. 
Otherwise $\alpha = 1$. Let $\mathcal{V}(t_1, x_1)$ denote the distribution of $\left(\{Y_t, t_1 \le t \le \zeta\}, \alpha\right)$. 

Let $\mathcal{B}$ be the family of sequences of the form $(i_1, i_2, \dots, i_k)$, where $i_1 = 0$ and each $i_j$ 
is either 0 or 1. If $\beta = (i_1, i_2, \dots, i_k)$ then we will write $\beta + 0 = (i_1, i_2, \dots, i_k, 0)$ and 
$\beta + 1 = (i_1, i_2, \dots, i_k, 1)$. We will say that $\beta + 0$ and $\beta + 1$ are offspring of $\beta$. 

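Fix any $x_1 \in F$ and $t_1 \ge 0$. There exists a branching process $V = V^{t_1,x_1}$ starting from a 
single individual with the following properties. Individuals $V^\beta$ in $V$ are indexed by $\beta \in \mathcal{B}$. 
Every individual $V^\beta$ is a process $\{V^\beta_t, s_\beta \le t \le t_\beta\}$ for some $0 \le s_\beta < t_\beta < \infty$. Let $\mathcal{B}_V$ 
denote the random set of all indices of all individuals in $V$. We always have $(0) \in \mathcal{B}_V$. We 
call $V^\gamma$ an offspring of $V^\beta$ if and only if $\gamma$ is an offspring of $\beta$. If $\beta \in \mathcal{B}_V$ then all ancestors 
of $\beta$ are also in $\mathcal{B}_V$. If $\beta \in \mathcal{B}_V$ has no offspring in $\mathcal{B}_V$ then we let $\alpha_\beta = 0$. Otherwise 
$\alpha_\beta = 1$. If $\gamma$ is an offspring of $\beta$ then $V^\gamma_{s_\gamma} = V^\beta_{t_\beta}$. If $\beta = (0)$ then the distribution of 
$\left(\{V^\beta_t, s_\beta \le t \le t_\beta\}, \alpha_\beta\right)$ is $\mathcal{V}(t_1, x_1)$. For any other $\beta \in \mathcal{B}_V$, the conditional distribution of 
$\left(\{V^\beta_t, s_\beta \le t \le t_\beta\}, \alpha_\beta\right)$ given the distribution of all ancestors of $V^\beta$ is that of $\mathcal{V}(s_\beta, V^\beta_{s_\beta})$. 
Let $\mathbb{V}(t_1, x_1)$ denote the distribution of $V^{t_1,x_1}$. 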

Remark 5.5. We will now argue that the process $V$ has a finite lifetime a.s. Let $K(t)$ be 
the number of individuals at time $t$. Suppose that the process $V$ starts at time 0 and its 
starting distribution is randomized so that the position of the unique individual at time 
0 has distribution X. The branching intensity $\lambda_t$ has been chosen so that the expected 
number of individuals is constant in time for this initial distribution, i.e., $E K(t) = 1$ for 
all $t \ge 0$. This implies that $K(t)$ cannot grow to infinity (in finite or infinite time) with 
positive probability. It follows that, with probability 1, for some $C_1 < \infty$, there will be 
arbitrarily large times $t_k$ with $K(t_k) \le C_1$. A standard argument based on the strong 
Markov property shows that $V$ has to become extinct within one unit of time of one of 
the $t_k$'s (or earlier), a.s. Since this is an almost sure result, it is easy to see that it implies 
that $V^{s,x}$ has a finite lifetime, a.s., for every $s \ge 0$ and $x \in F$. 

Suppose that = X\ and let «o = x(44,4) £ A. Let Az be the family of all 

descendants a of a 0 in A such that a = £(X/) for some j E {1,..., N } and t > 4- It is 
elementary to see that there exists a one to one mapping V : Az —> B with r(«o) = (0), 

preserving parenthood, i.e., r(aq) is a parent of r(a 2 ) if and only if aq is a parent of a 2 - 
We choose such a mapping T in an arbitrary way. Let £>z = Y(At). We let Z k,tl (t) be 
a branching process with individuals Z 13 for (3 G Bz- We call Z 13 an offspring of Z 7 if 
and only if f3 is an offspring of 7 . Every individual Z /3 is a process {Zf, sp < t < tp} 
for some 0 < sp < tp < 00 . If j3 — T(a) and a = ((ai, 61 ), (a 2 , b 2 ),..., (a m , b m )) then 
sp = inf{t > 0 : x(a m ,£,£) = a}, tp = sup{£ > 0 : y(a m ,t,f) = a} and Zf = Xf m for 
t G Note that Z k,tl has an infinite lifetime if and only if the spine J N is a part of 

this branching process. 

Consider any k > 1 and let u k be the time of the k- th branching point of the spine 
J N . Suppose that y( J, u k ) = j\ and note that there is a unique j 2 G {1,..., N } such that 
32 7^ 3i and x{j 2 , u k ,t) = x(ji,u k ,t) for all t < u k . Let Z k = Z j2 ’“ fc . 


Theorem 5.6. Assume that X is a probability measure on F with X(x) > 0 for all 
x G F. Suppose that X^ X as N —y 00 and consider the process Y°° with the initial 
distribution X. Fix any $k \ge 1$ and let $\mu_k(dt)$ be the distribution of the time of the $k$-th 
jump of a Poisson process with intensity $\lambda_t$, independent of $Y^\infty$. Then the distribution of 
$Z_k$ converges, as $N \to \infty$, to $\int \sum_{x\in F} \mathbb{V}(t, x) P(Y^\infty_t = x)\,\mu_k(dt)$. 

Proof. We will use notation and definitions from the proof of Theorem 5.4. Let 

$$\widetilde{A}_1 = \left\{\sup_{x\in F} \frac{\big|\mathcal{X}^N_{m\delta}(x) - P(Y^\infty_{m\delta} = x)\big|}{P(Y^\infty_{m\delta} = x)} \le \varepsilon\right\}$$

and note that this is almost the same as the event $A_1$ in (5.14) except that $m - 1$ is 
replaced with $m$. Recall that $m_1$ is defined to be a function of $\delta$ in the proof of Theorem 
5.4 and $\delta$ is specified later in that proof. 

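Recall that $C_J$ is the event that there was exactly one branching event along the spine 
on the interval $[(m-1)\delta, m\delta]$. We obtain from (5.29), (5.31) and (5.33) that for any 
$\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there exists $N_1$ so large that for $N \ge N_1$ and 
$1 \le m \le m_1$, 

$$P(C_J) \ge (2-6\varepsilon)\delta\lambda_{(m-1)\delta}(1 - 2\delta^{5/4}), \tag{5.34}$$

$$P(C_J) \le (2+6\varepsilon)\delta\lambda_{(m-1)\delta} + 2\delta^{5/4}. \tag{5.35}$$

This, (5.11) and (5.14) imply that for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there 
exists $N_1$ so large that for $N \ge N_1$ and $1 \le m \le m_1$, 

$$\begin{aligned}
P(\widetilde{A}_1 \mid C_J) &= \frac{P(\widetilde{A}_1 \cap C_J)}{P(C_J)} \ge \frac{P(C_J) - P(\widetilde{A}_1^c)}{P(C_J)} \\
&\ge \frac{(2-6\varepsilon)\delta\lambda_{(m-1)\delta}(1 - 2\delta^{5/4}) - \delta^4}{(2+6\varepsilon)\delta\lambda_{(m-1)\delta} + 2\delta^{5/4}} \ge 1 - 7\varepsilon.
\end{aligned} \tag{5.36}$$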

Fix some $s_1 \ge 0$ and $1 \le \ell \le N$. Consider the process $Z^{\ell,s_1}$ conditioned on $\{X^\ell_{s_1} = Z^{(0)}_{s_1} = x_1\}$. 
It follows from Proposition 5.2 that the distribution of the first individual in 
this process, i.e., $\{Z^{(0)}_t, t \ge s_1\}$ converges, as $N \to \infty$, to the distribution of the process 
$Y$ killed at the first jump of an independent Poisson process with intensity $\lambda_t$, for $t \ge s_1$. 
In other words, it converges to the distribution of the first individual in the process 
$V^{s_1,x_1}$. 

Next consider any family of sequences $\{\beta_1 = (0), \beta_2, \dots, \beta_m\} \subset \mathcal{B}$ such that if $\beta_j$ 
belongs to the family then the parent belongs to the family as well. By the Markov 
property applied at the death times of individuals, for any such family, for every $\beta_j$, 
Proposition 5.2 implies that the distribution of the individual labeled $\beta_j$ in $Z^{\ell,s_1}$ converges 
to the distribution of the individual with the same label in $V^{s_1,x_1}$. Moreover, we have 
convergence of the joint distribution of all individuals labeled $\beta_1, \beta_2, \dots, \beta_m$ in the process 
$Z^{\ell,s_1}$ to the joint distribution of the similarly labeled individuals in $V^{s_1,x_1}$. By Remark 
5.5 the process $V^{s_1,x_1}$ has a finite lifetime so $\mathcal{B}_V$ is finite, a.s. This completes the proof 
that the distribution of $Z^{\ell,s_1}$ converges to that of $V^{s_1,x_1}$. 

By Theorem 5.1 and (5.14), for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$, there 
exists $N_1$ so large that for all $N \ge N_1$, $1 \le \ell \le N$ and $1 \le m \le m_1$, 

$$P(\chi(J, m\delta) = \ell) \le \varepsilon. \tag{5.37}$$

We obtain from the above argument concerning the distribution of $Z^{\ell,s_1}$ combined with 
the Markov property applied at $m\delta$, Corollary 5.3, (5.36) and (5.37) that for any $\varepsilon > 0$, 
some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$ there exists $N_1$ so large that for all $x \in F$, $N \ge N_1$, 
$1 \le \ell \le N$ and $1 \le m \le m_1$, conditional on $C_J$ and the event $\{Z^{(0)}_{m\delta} = x,\ \chi(J, m\delta) \ne \ell\}$, 
the Prokhorov distance between the distribution of $Z^{\ell,m\delta}$ and that of $V^{m\delta,x}$ is less than $\varepsilon$. 

Since the definition of $j_2$ (an element of the definition of $Z_k$) does not refer to the 
post-$u_k$ process, the claim made in the last paragraph applies not only to a fixed $\ell$ but 
also to $j_2$. Hence, for any $\varepsilon > 0$, some $\delta_1 > 0$, for every $\delta \in (0, \delta_1)$, there exists $N_1$ so 
large that for all $N \ge N_1$ and $1 \le m \le m_1$, conditional on $C_J$, the Prokhorov distance 
between the distribution of $Z_k$ and that of $V^{m\delta,x}$ is less than $\varepsilon$. This easily implies the 
theorem. □ 

Remark 5.7. It is not hard to see that the following “propagation of chaos” assertion 
holds: for any fixed $k$, the processes $Z_1, Z_2, \dots, Z_k$ are asymptotically independent, when 
$N \to \infty$. 


6. Spine distribution for a fixed N 

The main result of this paper states that if the number N of particles of a Fleming-Viot 
process increases to infinity then the distribution of the spine converges to the distribution 
of the underlying Markov process conditioned not to hit the boundary. One could wonder 
whether the theorem must have the asymptotic character; perhaps the claim is true for 
every fixed N. This section is devoted to an example of a Fleming-Viot process with 
N = 2 particles such that the distribution of the spine is not the same as the distribution 
of the driving process conditioned on non-extinction. 

Let $Y_t$ be a continuous-time Markov process with the state space $E = \{0, 1, 2\}$, 
$F = \{1, 2\}$ and the transition rate matrix 

$$A = \begin{bmatrix} 0 & 0 & 0 \\ 4 & -6 & 2 \\ 1 & 6 & -7 \end{bmatrix}.$$

Let $X_t = (X^1_t, X^2_t)$ denote the 2-particle Fleming-Viot process based on $Y$. Then $X$ has 
the state space $\{(1,1), (1,2), (2,1), (2,2)\}$ and the transition rate matrix 

$$\mathbf{A} = \begin{bmatrix} -4 & 2 & 2 & 0 \\ 7 & -13 & 0 & 6 \\ 7 & 0 & -13 & 6 \\ 0 & 6 & 6 & -12 \end{bmatrix}.$$

Therefore the stationary distribution $\pi$ of $X$ determined by $\pi\mathbf{A} = 0$ is 

$$\pi = \left(\frac{7}{13}, \frac{2}{13}, \frac{2}{13}, \frac{2}{13}\right). \tag{6.1}$$
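The stationary distribution in (6.1) can be checked mechanically. The sketch below rebuilds the generator of the 2-particle process from the rate matrix of $Y$, assuming only the rule that a particle leaving $F$ jumps onto the other particle, and verifies the stationarity equation in exact arithmetic. The function and variable names are illustrative and not from the paper.

```python
from fractions import Fraction
from itertools import product

# Rate matrix of Y on E = {0, 1, 2}; state 0 is absorbing and F = {1, 2}.
A = [[0, 0, 0],
     [4, -6, 2],
     [1, 6, -7]]

def fleming_viot_generator(A):
    """Generator of the 2-particle Fleming-Viot process on F x F."""
    F = [1, 2]
    states = list(product(F, F))  # (1,1), (1,2), (2,1), (2,2)
    Q = {s: {u: Fraction(0) for u in states} for s in states}
    for s in states:
        for i in (0, 1):                      # index of the particle that moves
            other = s[1 - i]
            for y in (0, 1, 2):
                if y == s[i]:
                    continue
                # A particle that exits F lands on the other particle.
                target = other if y == 0 else y
                new = list(s)
                new[i] = target
                new = tuple(new)
                if new != s:
                    Q[s][new] += A[s[i]][y]
        Q[s][s] = -sum(Q[s][u] for u in states if u != s)
    return states, Q

states, Q = fleming_viot_generator(A)
pi = {(1, 1): Fraction(7, 13), (1, 2): Fraction(2, 13),
      (2, 1): Fraction(2, 13), (2, 2): Fraction(2, 13)}

# pi is stationary iff pi Q = 0 and its entries sum to one.
residual = [sum(pi[s] * Q[s][u] for s in states) for u in states]
print(residual, sum(pi.values()))
```

All four residuals vanish and the weights sum to one, which is consistent with (6.1).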

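Let $Y'$ and $Y''$ be independent copies of $Y$. The state space of $(Y', Y'')$ is 

$$\Lambda = \{(0,0), (0,1), (0,2), (1,0), (2,0), (1,1), (1,2), (2,1), (2,2)\}$$

and the transition rate matrix for $(Y', Y'')$ is 

$$B = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 4 & 0 & 4 & 0 & -12 & 2 & 2 & 0 \\
0 & 0 & 4 & 1 & 0 & 6 & -13 & 0 & 2 \\
0 & 1 & 0 & 0 & 4 & 6 & 0 & -13 & 2 \\
0 & 0 & 1 & 0 & 1 & 0 & 6 & 6 & -14
\end{bmatrix}.$$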

Let 

$$f(x, y) = P\big(Y'' \text{ reaches 0 before } Y' \mid Y'_0 = x,\ Y''_0 = y\big), \qquad x, y = 1, 2.$$

Then $f$ is harmonic with respect to $B$, i.e., for all $(x_1, x_2) \in \Lambda$, 

$$\sum_{(y_1,y_2)\in\Lambda} B((x_1, x_2), (y_1, y_2))\,\big(f(y_1, y_2) - f(x_1, x_2)\big) = 0.$$

It follows from the definition of $f$ that $f(0,1) = f(0,2) = 0$ and $f(1,0) = f(2,0) = 1$. By 
symmetry, $f(1,1) = \frac{1}{2}$ and $f(2,2) = \frac{1}{2}$. It is elementary to check that 

$$f(1,2) = \frac{5}{13}, \qquad f(2,1) = \frac{8}{13}.$$
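The values of $f$ can be recovered by solving the harmonicity equations on the four interior states. The following is a minimal sketch in exact rational arithmetic; the solver and all helper names are illustrative, and the boundary values are the ones stated above ($f = 1$ when the second coordinate hits 0 first, $f = 0$ when the first one does).

```python
from fractions import Fraction

A = [[0, 0, 0], [4, -6, 2], [1, 6, -7]]
F = [1, 2]
unknowns = [(x, y) for x in F for y in F]
idx = {s: i for i, s in enumerate(unknowns)}

def boundary(x, y):
    # f = 1 if Y'' has been absorbed (y == 0), f = 0 if Y' has been absorbed.
    return Fraction(1) if y == 0 else Fraction(0)

# Linear system M f = b expressing harmonicity at each interior state.
M = [[Fraction(0)] * 4 for _ in range(4)]
b = [Fraction(0)] * 4
for (x, y) in unknowns:
    r = idx[(x, y)]
    moves = [((z, y), A[x][z]) for z in range(3) if z != x] + \
            [((x, z), A[y][z]) for z in range(3) if z != y]
    for (nx, ny), rate in moves:
        M[r][r] -= rate
        if nx == 0 or ny == 0:
            b[r] -= rate * boundary(nx, ny)
        else:
            M[r][idx[(nx, ny)]] += rate

def solve(M, b):
    """Gauss-Jordan elimination over the rationals."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[r][n] for r in range(n)]

f = dict(zip(unknowns, solve(M, b)))
print(f)
```

The solution has $f(1,1) = f(2,2) = 1/2$, $f(1,2) = 5/13$ and $f(2,1) = 8/13$.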

Let $J_t$ denote the spine of $X_t$. The spine passes through $X^1$ (i.e., $\chi(J, t) = 1$) if and 
only if $X^2$ “jumps to 0” before $X^1$, after time $t$. The probability of this event is the same 
as the probability that $Y''$ will hit 0 before $Y'$, assuming that $(Y'_0, Y''_0) = (X^1_t, X^2_t)$. If 
$X_t = (1,1)$ then $J_t = 1$, and if $X_t = (2,2)$ then $J_t = 2$. If $X_t = (1,2)$ then $J_t = 1$ with 
probability $f(1,2) = \frac{5}{13}$ and, by symmetry, if $X_t = (2,1)$ then $J_t = 1$ with probability 
$\frac{5}{13}$. Assume that $X$ is in the stationary regime and recall the stationary probabilities for 
$X$ given in (6.1) to see that 

$$P(J_t = 1) = \frac{7}{13}\cdot 1 + \frac{2}{13}\cdot\frac{5}{13} + \frac{2}{13}\cdot\frac{5}{13} + \frac{2}{13}\cdot 0 = \frac{111}{169}. \tag{6.2}$$
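The arithmetic in (6.2) is a weighted average of the conditional probabilities that the spine sits at state 1. The short sketch below reproduces it with exact fractions; the stationary weights come from (6.1), and the value 8/13 for $f(2,1)$ is the one obtained from the harmonic equations for $f$. The dictionary names are illustrative.

```python
from fractions import Fraction

# Stationary weights from (6.1).
pi = {(1, 1): Fraction(7, 13), (1, 2): Fraction(2, 13),
      (2, 1): Fraction(2, 13), (2, 2): Fraction(2, 13)}
f12, f21 = Fraction(5, 13), Fraction(8, 13)

# Conditional probability that the spine is at state 1, given the configuration.
spine_at_1 = {
    (1, 1): Fraction(1),  # both particles are at 1
    (1, 2): f12,          # the spine must pass through X^1
    (2, 1): 1 - f21,      # the spine must pass through X^2
    (2, 2): Fraction(0),  # both particles are at 2
}

p_spine_1 = sum(pi[s] * spine_at_1[s] for s in pi)
print(p_spine_1)  # 111/169
```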


We will show that for a (generic) fixed $t > 0$, the distribution of $J_t$ is not the same as 
the distribution of $Y$ conditioned to stay in $F$ until time $t$, and it is not the same as the 
distribution of $Y$ conditioned to stay in $F$ forever. 

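Let $P_\mu$ denote the distribution of $Y$ with the initial distribution $\mu$ and assume that 
$\mu(0) = 0$. The transition probabilities of $Y$ are given by 

$$P_t := e^{tA} = \begin{bmatrix}
1 & 0 & 0 \\
1 - \frac{6}{7}e^{-3t} - \frac{1}{7}e^{-10t} & \frac{4}{7}e^{-3t} + \frac{3}{7}e^{-10t} & \frac{2}{7}e^{-3t} - \frac{2}{7}e^{-10t} \\
1 - \frac{9}{7}e^{-3t} + \frac{2}{7}e^{-10t} & \frac{6}{7}e^{-3t} - \frac{6}{7}e^{-10t} & \frac{3}{7}e^{-3t} + \frac{4}{7}e^{-10t}
\end{bmatrix},$$

so 

$$P_\mu(Y_t \ne 0) = \left(\frac{6}{7}\mu(1) + \frac{9}{7}\mu(2)\right)e^{-3t} + \left(\frac{1}{7}\mu(1) - \frac{2}{7}\mu(2)\right)e^{-10t}, \tag{6.3}$$

$$P_\mu(Y_t = 1) = \left(\frac{4}{7}\mu(1) + \frac{6}{7}\mu(2)\right)e^{-3t} + \left(\frac{3}{7}\mu(1) - \frac{6}{7}\mu(2)\right)e^{-10t}, \tag{6.4}$$

$$P_\mu(Y_t = 2) = \left(\frac{2}{7}\mu(1) + \frac{3}{7}\mu(2)\right)e^{-3t} + \left(\frac{4}{7}\mu(2) - \frac{2}{7}\mu(1)\right)e^{-10t}. \tag{6.5}$$

It follows that 

$$P_\mu(Y_t = 1 \mid Y_t \ne 0) = \frac{P_\mu(Y_t = 1)}{P_\mu(Y_t \ne 0)} = \frac{2(2\mu(1) + 3\mu(2)) + (3\mu(1) - 6\mu(2))e^{-7t}}{3(2\mu(1) + 3\mu(2)) + (\mu(1) - 2\mu(2))e^{-7t}} \to \frac{2}{3}$$

as $t \to \infty$, regardless of the initial distribution $\mu$. Comparing this value to (6.2), we see 
that for large $t$ the distribution of $J_t$ (in the stationary regime) is not the same as the law 
of $Y_t$ conditioned to stay in $F$ until $t$. 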
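Both the closed form of $e^{tA}$ and the limit $2/3$ can be confirmed numerically. The sketch below compares the closed-form transition matrix with a truncated Taylor series for the matrix exponential (combined with scaling and squaring) and then evaluates the conditional probability for growing $t$. The initial law $(0.3, 0.7)$ and the time points are arbitrary sample choices, and the helper names are illustrative.

```python
import math

A = [[0.0, 0.0, 0.0], [4.0, -6.0, 2.0], [1.0, 6.0, -7.0]]

def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(M, t, terms=30, squarings=8):
    """exp(t*M) by a truncated Taylor series with scaling and squaring."""
    n = len(M)
    scaled = [[t * v / 2 ** squarings for v in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + s for r, s in zip(rr, ss)] for rr, ss in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def closed_form(t):
    """Transition matrix P_t in the closed form stated in the text."""
    e3, e10 = math.exp(-3 * t), math.exp(-10 * t)
    return [
        [1.0, 0.0, 0.0],
        [1 - (6 * e3 + e10) / 7, (4 * e3 + 3 * e10) / 7, (2 * e3 - 2 * e10) / 7],
        [1 - (9 * e3 - 2 * e10) / 7, (6 * e3 - 6 * e10) / 7, (3 * e3 + 4 * e10) / 7],
    ]

def conditional_law_at_1(mu1, mu2, t):
    """P_mu(Y_t = 1 | Y_t != 0) for an initial law mu on F = {1, 2}."""
    P = closed_form(t)
    p1 = mu1 * P[1][1] + mu2 * P[2][1]
    p2 = mu1 * P[1][2] + mu2 * P[2][2]
    return p1 / (p1 + p2)

gap = max(abs(a - b)
          for ra, rb in zip(mat_exp(A, 0.3), closed_form(0.3))
          for a, b in zip(ra, rb))
print("max |series - closed form| at t = 0.3:", gap)
for t in (0.0, 0.5, 2.0, 5.0):
    print(t, conditional_law_at_1(0.3, 0.7, t))
print("2/3 =", 2 / 3, " 111/169 =", 111 / 169)
```

The conditional probability settles at $2/3 \approx 0.6667$, which is visibly different from $111/169 \approx 0.6568$.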

Next we will compare the distribution of $J_t$ with the distribution of the process $Y$ 
conditioned to stay in $F$ forever, i.e., $Y^\infty_t$. Since 0 is an absorbing state, 

$$P_\mu(Y^\infty_t = x) = \lim_{s\to\infty} P_\mu(Y_t = x \mid Y_{t+s} \ne 0), \qquad x = 1, 2, \quad t \ge 0.$$

By the Markov property, 

$$\begin{aligned}
P_\mu(Y_t = 1 \mid Y_{t+s} \ne 0)
&= \frac{P_\mu(Y_t = 1)(1 - P_s(1,0))}{P_\mu(Y_t = 1)(1 - P_s(1,0)) + P_\mu(Y_t = 2)(1 - P_s(2,0))} \\
&= \frac{P_\mu(Y_t = 1)}{P_\mu(Y_t = 1) + P_\mu(Y_t = 2)\,\dfrac{1 - P_s(2,0)}{1 - P_s(1,0)}}.
\end{aligned} \tag{6.6}$$

By the formula for $P_s$ given above, 

$$\lim_{s\to\infty} \frac{1 - P_s(2,0)}{1 - P_s(1,0)} = \lim_{s\to\infty} \frac{9 - 2e^{-7s}}{6 + e^{-7s}} = \frac{3}{2},$$

so this and (6.4)-(6.5) imply that 

$$P_\mu(Y^\infty_t = 1) = \frac{4}{7} + \frac{6}{7}\cdot\frac{\mu(1) - 2\mu(2)}{2\mu(1) + 3\mu(2)}\, e^{-7t}.$$

This probability converges to $4/7$ when $t \to \infty$. This value is different from that in (6.2) 
so for large $t$ the distribution of $J_t$ (in the stationary regime) is not the same as the law 
of $Y^\infty_t$. 
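The last computation can be checked in the same spirit. The sketch below evaluates the right hand side of (6.6) with the limiting ratio $3/2$, using (6.4) and (6.5), and compares it with the closed form $\frac{4}{7} + \frac{6}{7}\frac{\mu(1) - 2\mu(2)}{2\mu(1) + 3\mu(2)}e^{-7t}$. The initial law and the time points are arbitrary sample choices, and the function names are illustrative.

```python
import math

def survival_ratio(s):
    """(1 - P_s(2,0)) / (1 - P_s(1,0)) written as (9 - 2e^{-7s}) / (6 + e^{-7s})."""
    return (9 - 2 * math.exp(-7 * s)) / (6 + math.exp(-7 * s))

def law_at_time_t(mu1, mu2, t):
    """(P_mu(Y_t = 1), P_mu(Y_t = 2)) as in (6.4) and (6.5)."""
    e3, e10 = math.exp(-3 * t), math.exp(-10 * t)
    p1 = (4 * mu1 + 6 * mu2) / 7 * e3 + (3 * mu1 - 6 * mu2) / 7 * e10
    p2 = (2 * mu1 + 3 * mu2) / 7 * e3 + (4 * mu2 - 2 * mu1) / 7 * e10
    return p1, p2

def conditioned_forever_at_1(mu1, mu2, t):
    """P_mu(Y_t^infty = 1) from (6.6) with the limiting ratio 3/2."""
    p1, p2 = law_at_time_t(mu1, mu2, t)
    return p1 / (p1 + 1.5 * p2)

def closed_form_at_1(mu1, mu2, t):
    """Closed form 4/7 + (6/7) (mu1 - 2 mu2) / (2 mu1 + 3 mu2) e^{-7t}."""
    return 4 / 7 + 6 / 7 * (mu1 - 2 * mu2) / (2 * mu1 + 3 * mu2) * math.exp(-7 * t)

print(survival_ratio(10.0))
for t in (0.0, 0.4, 1.0, 5.0):
    print(t, conditioned_forever_at_1(0.3, 0.7, t), closed_form_at_1(0.3, 0.7, t))
print("4/7 =", 4 / 7, " 111/169 =", 111 / 169)
```

The two expressions agree for every sampled $t$ and approach $4/7 \approx 0.5714$, again different from $111/169$.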